Metabarcoding of bacteria and parasites in the gut of Apodemus agrarius

Background The striped field mouse Apodemus agrarius is a wild rodent commonly found in fields in Korea. It is a known carrier of various pathogens. Amplicon-based next-generation sequencing (NGS) targeting the 16S ribosomal RNA (rRNA) gene is the most common technique used to analyze the bacterial microbiome. Although many bacterial microbiome analyses have been attempted using feces of wild animals, only a few studies have used NGS to screen for parasites. This study aimed to rapidly detect bacterial, fungal and parasitic pathogens in the guts of A. agrarius using NGS-based metabarcoding analysis. Methods We conducted 18S/16S rDNA-targeted high-throughput sequencing on cecal samples collected from A. agrarius (n = 48) trapped in May and October 2017. Taxa of protozoa, fungi, helminths and bacteria in the cecal content were then identified. Results Among the protozoa identified, the most prevalent was Tritrichomonas sp., found in all of the cecal samples, followed by Monocercomonas sp. (95.8% prevalence; in 46/48 samples) and Giardia sp. (75% prevalence; in 36/48 samples). For helminths, Heligmosomoides sp. was the most common, found in 85.4% (41/48) of samples, followed by Hymenolepis sp. (10.4%; 5/48) and Syphacia sp. (25%; 12/48). The 16S rRNA gene analysis showed that the microbial composition of the cecal samples changed by season (P = 0.005), with the linear discriminant analysis effect size showing that in the spring Escherichia coli and Lactobacillus murinus were more abundant and Helicobacter rodentium was less abundant. Helicobacter japonicus was more abundant and Prevotella_uc was less abundant in males. The microbial composition changed based on the Heligmosomoides sp. infection status (P = 0.019); specifically, Lactobacillus gasseri and Lactobacillus intestinalis were more abundant in the Heligmosomoides sp.-positive group than in the Heligmosomoides sp.-negative group. Conclusions This study demonstrated that bacterial abundance changed based on the season and specific parasitic infection status of the trapped mice. These results highlight the advantages of NGS technology in monitoring zoonotic disease reservoirs. Graphical Abstract Supplementary Information The online version contains supplementary material available at 10.1186/s13071-022-05608-w.

pathogen screening method for the surveillance of infected animals.
The striped field mouse, Apodemus agrarius, is a wild rodent that is commonly found in fields in the Republic of Korea [4]. This wild rodent can spread a great number of infectious bacteria and parasites through its feces [1,5]. Also, rodents live in a wide variety of habitats, including agricultural regions and man-made environments [3], and they have high reproductive rates, which is a well-known characteristic of r-selected species. Owing to these specific characteristics, wild rodents are considered to be one of the most dangerous reservoirs of infectious organisms among wild animals.
Molecular, serological and microscopic methods, such as the PCR and enzyme-linked immunosorbent assay (ELISA), have been used to detect pathogens in wild rodent samples. For example, Orientia tsutsugamushi, Anaplasma phagocytophilum and Leptospira interrogans were identified in the spleens and the blood of striped field mice [6]. In addition, zoonotic helminths, such as Hymenolepis diminuta, were identified using light microscopy [7][8][9][10]. However, such conventional methods have limitations when screening a large number of samples for a variety of pathogens.
Recent studies have analyzed the microbiomes of feces of wild animals using next-generation sequencing (NGS) for a more integrated and rapid screening approach [11][12][13]. Amplicon-based NGS targeting the 16S rRNA gene is currently the most widely used technique to analyze the bacterial microbiome [14]. The NGS technique can be applied to detect veiled pathogens because of its untargeted nature and ability to investigate non-culturable organisms [15]. Although many bacterial microbiome analyses have been attempted using the feces of wild animals [16], only a few studies have used NGS to screen feces for parasites. [17][18][19]. Therefore, we decided to use NGS to detect various kinds of pathogens.
In this study, we used 18S ribosomal RNA (rRNA) gene amplicon-based NGS to screen for fungi and parasites and 16S rRNA gene amplicon-based NGS to screen for bacteria in the gut of A. agrarius. To the best of our knowledge, this is the first such study to be carried out in the Republic of Korea. A number of earlier studies have revealed the interaction between parasites and host microbiota [20][21][22][23]. Interestingly, some parasites need alterations in the host microbiota to promote their successful survival and control of parasite numbers [24,25]. The host microbiota has also been shown to function as a resistance factor for parasitic infection [26]. Hence, we compared differences in the microbial composition of hosts based on their parasite infection status. In addition, seasonal variation could affect the host's food availability, resulting in microbial differences and affecting the chance of infection of a parasite [27,28]. Thus, we also compared the microbial composition based on seasonality. Cecal contents were used because these contain assorted organisms, including pathogens, and are appropriate study material for analyzing the interaction between parasites and the bacterial microbiome.

Sample collection
In total, 48 striped field mice (A. agrarius) were captured using Sherman Live Traps (H. B. Sherman Traps, Inc., Tallahassee, FL, USA) from Gangneung and Wonju, Gangwon-do, Korea in May and October 2017. Detailed information on the captured wild rodents used in this study is included in Additional file 1: Table S1. Traps were opened after 24 h; all mice were alive when the traps were opened and subsequently euthanized on the same day using a CO 2 chamber. They were immediately dissected, and the ceca were collected and stored at -70 °C until use. At 6 months after collection, the cecal contents were extracted from the cecal lumen using disposable sterile bacterial spreaders. Cecal DNA was extracted using the Fast DNA SPIN Kit for Soil (MP Biomedicals, Carlsbad, CA, USA) according to the manufacturer's protocol. The DNA samples were stored at − 80 °C until needed.

Processing and bioinformatics of iSeq100 data
Geneious Prime ® 2022.0.2 (Biomatters Ltd., Auckland, New Zealand) was used to process and assemble raw 18S V9 reads as previously described [31,32]. Briefly, low-quality sequences (< Q25) were filtered using BBDuk (v38.84). The forward and reverse reads were merged to produce a single consensus sequence using BBMerge (v38.84) using the 'high rate' setting. Sequences of 120-260 bp in length were sorted. The UCHIME algorithm was used to detect and remove chimeric sequences [33]. Closely related sequences were clustered into separate operational taxonomic units (OTUs) using de novo assembly and the default setting, which is a "Minimum Overlap Identity" of 98%. To create a sequence classifier database, the OTUs were aligned via sequence clustering using the Basic Local Alignment Search Tool (BLAST) and the NCBI "nt" GenBank database (November 2021 version). Then, the full sequences of BLAST hits from the NCBI were downloaded, and only the regions of the BLAST hits were extracted in order to create the sequence classifier database. The Geneious Sequence Classifier plugin was used to classify the merged amplicon dataset, using the created sequence classifier database. The 'very high sensitivity/slow' mode was used, with a minimum overlap of 90 bp. The sequences in the database that showed the highest homology were selected in the final taxonomic identification result [34].
Bacterial microbiome analysis of the 16S rRNA gene sequence data was performed using EzBioCloud, a commercially available ChunLab bioinformatics cloud platform for microbiome research (https:// www. ezbio cloud. net/). Bioinformatic analyses were performed as previously described [35,36]. Briefly, raw reads were quality checked, and low-quality (< Q25) reads were filtered using Trimmomatic 0.32 [37]. Paired-end sequence data were then merged using PandaSeq [38]. Primers were then trimmed using the ChunLab in-house program (ChunLab, Inc., Seoul, Korea), which applied a similarity cut-off of 0.8. Sequences were denoised using the Mothur pre-clustering program, which merges sequences and extracts unique sequences, allowing up to two differences between sequences [38]. The EzBioCloud database (https:// www. ezbio cloud. net/) [36] was used for taxonomic assignment using BLAST 2.2.22, and pairwise alignments were generated for similarity calculations [39,40]. The UCHIME algorithm and non-chimeric 16S rRNA database from EzTaxon were used to detect chimeric sequences for reads with a best-hit similarity rate of < 97% [33] (a 97% similarity is generally used as the cutoff for species-level identification). Sequence data were then clustered using CD-Hit and UCLUST [41,41]. All subsequent analyses were performed using EzBioCloud.
Rarefaction for the obtained OTUs was calculated using the ChunLab pipeline, in accordance with a previous protocol [42]. The reads were normalized to 8000 for diversity analyses. We computed the Shannon index [43] and performed principal coordinates analysis (PCoA) [44], permutational multivariate analysis of variance (PERMANOVA) [45] and permutational multivariate analysis of dispersion (PERMDISP) [46] based on the generalized Bray-Curtis distance. The PERMANOVA and PERMDISP tests were used to assess the differences in the microbial community structure based on various factors, including season and parasitic infection status. We used the Wilcoxon rank-sum test to test for differences in the number of OTUs and Shannon index to compare microbiome diversity between the groups separated based on season and parasite infection status. Linear discriminant analysis (LDA) effect size (LEfSe) analysis was used to identify significantly different taxa between the groups [47]. In addition, we used the theoretical framework of a previous study to investigate the impacts (synergistic, neutral or antagonistic) of parasitic co-infection on bacterial diversity change when mice were infected by multiple parasites [48].

Eukaryotic organisms in the A. agrarius gut
The average (± standard deviation [SD]) number of assigned read counts was 34,957 ± 19,899. The maximum and minimum number of reads were 88,107 and 2128, respectively. These reads included only protozoa, helminths and fungi, as the average host reads (221 ± 203) were removed before analysis. The relative abundances of fungi, protozoa and helminths in individual A. agrarius animals are shown in Fig. 1a. The relative abundances of protozoal taxa were greater than those of fungi and helminths in all of the A. agrarius samples except for one. All cecal samples were infected with Tritrichomonas sp. (100%, 48/48) followed in order of prevalence of infection by Monocercomonas sp. (95.8%; 46/48) and Giardia sp. (75%; 36/48, Fig. 1b). Isospora sp. were found in six samples, Cryptosporidium sp. were found in five samples and Blastocystis sp. were found in one sample. In addition, Entamoeba sp., Spironucleus sp. and Retortamonas sp. were identified.
In this study, 27 of the 48 cecal samples were found to contain fungi, among which Kazachstania sp. was the most dominant species (Fig. 1c). Mucor sp. were found in two samples and Candida sp., Rhizopus sp., Cladosporium sp. and Periconia sp. were found in one sample (Fig. 1c).

Bacteria in the A. agrarius gut
High-throughput sequencing of the 16S rRNA gene of 48 cecal content samples of A. agrarius using the iSeq 100 system produced an average (± SD) number of assigned read counts of 27,697 ± 14,281. The relative abundances of bacterial taxa in the cecal microbiomes of individual wild rodents are shown in Fig. 2. At the family level, all samples were dominated by the presence of Muribaculaceae (relative abundance: 9.32-57.13%; average abundance: 26.71%), followed in order of relative abundance by Lachnospiraceae (relative abundance: 3.95-61.59%; average abundance: 16.83%), which was also detected in all samples. Bacterial OTUs at all the taxonomic levels are provided in Additional file 2: Table S2. Helicobacter rodentium and Helicobacter aurati were detected in 100% (48/48) and 72.9% (38/48) of samples, respectively. Helicobacter fennelliae was found in one sample.

Bacterial microbiome differences based on the season
The number of OTUs did not differ between the mice caught in spring (n = 25 mice; median OTUs: 941) and those caught in fall (n = 23 mice; median OTUs: 808; P = 0.337) (Fig. 3a). The Shannon index also did not differ between the mice caught in the spring (median Shannon index: 4.70) and those caught in the fall (median Shannon index: 4.68; P = 0.657 (Fig. 3b). However, the PCoA and PERMANOVA showed that the gut bacterial composition of mice caught in the spring and fall did differ significantly (PERMANOVA: F = 1.805, R 2 = 0.042, P = 0.005; PERMDISP: F = 0.04, R 2 = 0.0009, P= 0.83) (Fig. 3c). Escherichia coli (LDA score: 4.057) and Lactobacillus murinus (LDA score: 3.529) were more abundant in the mice collected in the spring, but Helicobacter rodentium (LDA score: 3.773) was less abundant in mice collected in the spring than in those collected in the fall (Fig. 3d).  Fig. 4c). Interestingly, Lactobacillus gasseri (LDA score: 3.667) and Lactobacillus intestinalis (LDA score: 3.492) were more abundant in the Heligmosomoides sp.-positive group than in the Heligmosomoides sp.-negative group (Fig. 4d). We then tested the impact of Heligmosomoides sp. and Giardia sp. coinfection. Mono-infection with Heligmosomoides sp. did not alter the Shannon index (P = 0.874; Additional file 3: Fig. S1a), but the pair-wise PERMANOVA (Bray-Curtis distance) indicated significantly different microbial compositions between the Heligmosomoides sp. mono-infected and uninfected groups, and between the co-infected (Heligmosomoides sp. and Giardia sp.) and uninfected groups (P = 0.012 and P = 0.013, respectively). PERMANOVA did not indicate a significant difference between the mono-infected and co-infected groups (P = 0.251; Additional file 3: Fig. S1b).

Discussion
Wild rodents are likely to play roles as vital reservoirs of zoonotic pathogens, including bacteria, parasites and fungi [49,50]. Pathogens can be spread to humans via direct contact or through the ingestion of food and water contaminated with rodent feces [51]. In the present study, we comprehensively investigated the presence of prokaryotic and eukaryotic organisms in the gut of A. agrarius using metabarcoding and analyzed interactions among them.
Using the screening method described in the Methods section, i.e. the Illumina iSeq 100 system, we detected potential prokaryotic and eukaryotic pathogens. The metabarcoding method has many advantages over conventional methods, such as PCR and microscopic and culture-based screening. The metabarcoding technique can be applied when screening a large sample of a variety of pathogens at one time, thereby saving costs and decreasing the amount of labor and time required for the analysis. This technique can also be used to detect veiled pathogens because of its untargeted nature and ability to investigate non-culturable organisms, which are problematic to investigate using conventional screening methods.
Notably, this study identified various Helicobacter strains in feces of the collected wild A. agrarius (Additional file 2: Table S2). Many studies have demonstrated that wild rodents can be reservoirs of various Helicobacter strains [52][53][54]. Helicobacter fennelliae, which was identified in one sample in the present study, is known to cause gastroenteritis in immunocompromised humans [55]. Our results showed that A. agrarius is a repository of various Helicobacter strains, some of which may be pathogenic to humans. Serratia marcescens, known as an opportunistic pathogen, was also detected in 17 samples in the present study, and this species can cause severe symptoms in patients, such as sepsis, pneumonia and meningitis [56]. Leptospira interrogans, O. tsutsugamushi, and A. phagocytophilum are known as infectious pathogens and were previously reported to be present in the spleen, kidney and blood of A. agrarius at a prevalence of 4.92%, 17.6% and 19.1%, respectively; however, these species were not detected in the current study [7].
Similar to a previous study conducted in the UK, we found a lower relative abundance of Lactobacillus in the samples collected in the fall compared to those collected in the spring, whereas there was a higher relative abundance of Helicobacter in the fall [52].
Unlike bacterial community studies using the 16S rRNA gene, metabarcoding analysis targeting eukaryotic communities is still in its early stage. In the present study, we identified the infection status of parasites and fungi in the rodent gut through 18S rRNA gene amplicon sequencing. A previous study that analyzed the feces of seven Rattus norvegicus and two Rattus rattus demonstrated the strength of the metabarcoding method in comparison to microscopy [57]. In that previous study, all of the different helminths, such as Ascaridia and Hymenolepis, found using microscopy were also detected by the Illumina-based metabarcoding method [57]. In our study, we used the NCBI database as it contains a greater range of data than the SILVA database used in that previous study. For example, Heligmosomoides sp. was found in the NCBI database and not in the SILVA database. The samples tested in the current study were found to contain many of the parasites reported in previous publications to be present in wild mice [50,[58][59][60][61][62][63]. In our study, we identified Isospora sp., Cryptosporidium sp. and Blastocystis sp., all of which might include zoonotic agents. Cryptosporidium parvum, for example, is a zoonotic pathogen that causes diarrhea in humans [63]. We also identified potential fungal pathogenic agents, including Mucor sp. and Rhizopus sp., which are major fungal pathogens that can cause mucormycosis in humans [64,65]. Cladosporium sp., Periconia sp., Candida sp. and Kazachstania sp. were also found, the presence of which were previously reported in human infection cases [66][67][68].
Interestingly, Hymenolepis sp. was only detected in mice collected in the spring. The authors of a previous study reported that the temperature and humidity conditions during the summer and fall seasons could be advantageous for Hymenolepis sp. infection in wild rodents [60]. In the present study, we detected Syphacia sp. in only 25% (12/48) of samples. A previous study reported that Syphacia sp. could be found in 14.0% of wild rodents. Albeit rare, Syphacia sp. can infect humans and is a zoonotic parasite [69,70]. Heligmosomoides sp. was the most prevalent infectious helminth in the present study (present in 85% of samples). A previous report suggested that intestinal helminth infections are more prevalent in Heligmosomoides sp.-infected wild mice than in their uninfected counterparts [61]. We detected Raillietina sp. in 8.3% of samples; this tapeworm was reported to have the highest prevalence of all potential zoonotic helminths infecting wild rodents in the Indochinese Peninsula [62]. We also found Strongyloides sp. in three samples. Strongyloides ratti is a skin-penetrating nematode and normally used as a laboratory model for Strongyloides stercoralis. Oscheius sp. and Panagrolaimus sp. have not been reported in wild rodents to date. Oscheius spp. was previously identified as an entomopathogenic nematode [71]. Panagrolaimus spp. is a free-living nematode that feeds on bacteria, and it has been isolated from soil [72].
Plagiorchis sp. was detected in three samples in the present study. Parasitic trematodes of the genus Plagiorchis have been reported to have zoonotic potential. Plagiorchis muris and Plagiorchis elegans have been known to cause intestinal infections in wild mice [73]. In addition, Plagiorchis sp. has been reported to cause intestinal infections in human patients in Japan and Korea [74]. In 2007 and 2014, P. muris was reported in A. agrarius in Korea (5.3% and 14.8%, respectively) [75,76]. In a study carried out in the UK, 717 P. elegans specimens were collected from the small intestines of 27 of 117 wood mouse (Apodemus sylvaticus) samples [77].
There was no difference in the alpha diversity between Heligmosomoides sp.-infected and Heligmosomoides sp.uninfected mice in the present study. This result agrees with those of Kreisinger et al. regarding the impact of helminth infections on microbial compositions [19]. In particular, we noted higher L. gasseri and L. intestinalis abundances in the Heligmosomoides sp.-infected group. A recent study demonstrated that Heligmosomoides sp. helminth infection alters the intestinal microbiota [4]. The results of other studies also confirm that the prevalence of Lactobacillus is increased in laboratory mice infected with various intestinal helminths [78][79][80].
When we analyzed the impact of Heligmosomoides sp. and Giardia sp. co-infection, we noted that co-infection did not cause any significantly different effects (neutral effect) compared to Heligmosomoides sp. mono-infection (Additional file 3: Fig. S1).
Interestingly, we were able to detect Heligmosomoides sp. genes in the ceca despite this parasite typically residing within the small intestine. This dection is due to the sensitivity of metabarcoding analysis to detect and identify gene sequences from small amounts of parasitic cells, tissues and eggs in the ceca.
In this study, parasitic worms or eggs were not collected and identified under a microscope. In addition, since this study was conducted using the Illumina iSeq 100 system, which covers short sequence lengths, there is a limitation in the resolution of accurate identification of the parasite species. For example, the 18S V9 regions of Heligmosomoides sp. and Nippostrongylus brasiliensis differ by 1 bp although all samples had a higher identity to Heligmosomoides sp. Metabarcoding using various primers that target different regions of 18S rRNA gene, such as V4 and V9, may produce more accurate metabarcoding information [81].
This study did not distinguish the fungi that reside within animal guts from those that are non-residents and ingested while feeding. Further research on this topic is needed to facilitate a more precise understanding of the causes and consequences of variations in wild animal gut fungi and parasites compositions [82]. We were unable to detect blood pathogens in this study due to the nature of cecal samples. Future studies will attempt to detect such pathogens from other organ tissues.

Conclusions
We screened bacteria, fungi, protozoa and helminths in the gut of A. agrarius using 16S and 18S rDNAtargeted high-throughput sequencing and identified potential zoonotic pathogens such as Cryptosporidium sp. and Hymenolepis sp. In addition, the bacterial composition was found to be changed based on the season and specific parasitic infection status of collected mice. This approach, with some improvements, will enable us to analyze a large number of samples in a high throughput manner and could be the next standard applied to investigate bacterial and parasitic infections.